Abstract — In many complex engineered systems, the ability
to give an alarm prior to impending critical events is of great 
importance. These critical events may have varying degrees of 
severity, and in fact they may occur during normal system opera- 
tion. In this article, we investigate approximations to theoretically 
optimal methods of designing alarm systems for the prediction of 
level-crossings by a zero-mean stationary linear dynamic system 
driven by Gaussian noise. An optimal alarm system is designed 
to elicit the fewest false alarms for a fixed detection probability. 
This work introduces the use of Kalman filtering in tandem with 
the optimal level-crossing problem. It is shown that there is a 
negligible loss in overall accuracy when using approximations to 
the theoretically optimal predictor, at the advantage of greatly 
reduced computational complexity. 

Index Terms — Optimal alarm theory, Level-crossing theory, 
Kalman prediction 

I. Introduction 

THIS article introduces a novel approach of combining
the practical appeal of Kalman filtering with the design
of an optimal alarm system for the prediction of
level-crossing events. A comprehensive demonstration of practical
application for the design of optimal alarm systems has been
covered in the literature [1], [2], [3]. The background
theory for optimal alarm systems has also seen modest coverage
by other authors [4], [5], [6], [7]. The latter is by no
means a comprehensive list, but illustrates a cross-section of
the primary authors responsible for introducing optimal alarm
systems in a classical and practical sense.

It was shown by Svensson [1], [2] that an optimal alarm 
system can be constructed by finding relevant alarm system 
metrics (as are used in ROC curve analysis) as a function of a 
design parameter by way of an optimal alarm condition. The 
optimal alarm condition is fundamentally an alarm region or 
decision boundary based upon a likelihood ratio criterion via 
the Neyman-Pearson lemma, as shown in [5], [6]. This allows 
us to design an optimal alarm system that will elicit the fewest 
possible false alarms for a fixed detection probability. This be- 
comes important when considering the numerous applications 
that might benefit from an intelligent tradeoff between false 
alarms and missed detections. 

Because optimal alarm regions cannot be
expressed in closed form, one of the aims of this study is
to investigate approximations for the design of an optimal
alarm system. The resulting metrics can easily be compared
to competing methods that may also provide some level of
predictive capability, but have no provision for minimizing
false alarms for the prediction of level-crossing events.

There are several examples of level-crossing events to be 
studied, varying from a simple one-sided case to a more com- 
plicated two-sided case. The former one-sided case involves 
exceedances and/or upcrossings of a single level spanning two 
adjacent time points for a discrete-time process. This is the 
case that has traditionally been studied in previous work and 
invokes ARMA(X) prediction methods [1], [2], [4], [5], [6], 
[7]. The latter two-sided case involves a level crossing event 
that may span many time points and exceed upper and lower 
levels symmetric about the mean of the process many times 
during this timeframe. 

A variant of the latter more complicated two-sided case has 
been investigated by Kerr [8] and uses a Kalman filter-based 
approach. The two-sided case is more practically relevant 
when monitoring residuals that may be derived from the 
output of other machine learning algorithms or transformed 
parameters that relate to system performance. We investigate 
the two-sided case here, and also use a Kalman filter-based 
approach in an optimal manner relevant for the prediction of 
level-crossings. 

The prediction of such a level-crossing event is also very
similar to what has been established as the state of the art for
newly minted spacecraft engines, as studied in [9]; however,
no guarantees of optimality exist. This provides us additional
practical motivation for investigating a level-crossing event that
spans many time points, moving beyond what has previously
been studied in this vein. In general, the design of optimal
alarm systems demonstrates practical potential to enhance re- 
liability and support health management for space propulsion, 
civil aerospace applications, and other related fields. Due to 
the great costs, not to mention potential dangers associated 
with a false alarm due to evasive or extreme action taken 
as a result of false indications, there are great opportunities 
for cost savings/cost avoidance, and enhancement of overall 
safety. Nonetheless, our intent is to demonstrate the utility 
of optimal level-crossing prediction from a more theoretical 
perspective. 

There is an extensive history of invoking Kalman-filter- 
based approaches within the failure detection literature. A few 
of the most groundbreaking articles that discuss the use of 
Kalman filter methods for failure detection have been authored 
by Kerr [8], and Willsky and Jones [10]. Both of these
articles have a long history of related methods descending
from them, e.g., [11], which alludes to the use of the
Neyman-Pearson lemma. More recently, the Kalman filter
has been used to address the level-crossing prediction problem
in application to condition monitoring [12], although without
any theoretical guarantees of optimality. A competitor to the
optimal alarm system is described in [13], and uses adap- 
tive optimal on-line techniques in a Bayesian formulation, 
providing more modeling flexibility. However, there are still 
considerable computational issues with such an approach, and 
a well-defined cost function is still required, even when the 
posterior probability is adaptively updated. 

One recent criticism of [10], by Kerr [14], addresses its claim
of optimality. The method presented by Willsky
and Jones is characterized by a formulation of the failure
detection problem involving the GLR (Generalized Likelihood
Ratio) test. The method derived by Kerr yields
a failure detection algorithm whose design is performed by
computing false alarm and correct detection probabilities over
a time interval. Neither method is optimal in the sense used
to predict level-crossings, as introduced by DeMare [5] and
Svensson et al. [2]. Other standard methods based upon the
GLR test and the SPRT (Sequential Probability Ratio Test) invoke
hypothesis tests that are geared more toward detecting
changes in model parameters than toward level-crossings.

As was previously mentioned in this section, we aim to 
more precisely close the gap between the use of Kalman 
filtering and optimal alarm systems in this article. Although 
this article is motivated by fault detection and prediction, 
and it is recognized that the literature in this area is quite 
expansive, our investigation aims to shed light on a segment 
of the literature that has been largely overlooked. 

II. Performance Analysis 

As mentioned previously, relevant alarm system metrics 
such as ones used in ROC curve analysis can be expressed as a 
function of a design parameter via an optimal alarm condition. 
These same metrics will act as the basis for comparison 
to competing methods that are functions of different design 
parameters. These competing methods may provide some level 
of predictive capability, but have no provision for minimizing 
false alarms. The two primary methods to provide a baseline 
for comparison are to compare the process value with a 
fixed threshold, or the “redline” method, and to compare 
future predicted process value with a fixed threshold, or the 
“predictive” method. 

However, in both cases it is important to make the
distinction between the critical level, $L$, associated with the level
crossing event to be predicted and the fixed threshold referred
to above, denoted as $L_A$. The critical level represents the
threshold above which damage or some significant decrease in
quality of a behavior or process may potentially occur. There
are some cases in which this critical level, $L$, is not known
or has not been designed a priori, or in which known critical levels
yield alarm systems that are practically infeasible. As such,
sometimes it is beneficial to use values that are based upon
statistical outlier detection and hypothesis testing via the
p-value.

The fixed threshold, $L_A$, essentially acts as a design
parameter with which to tune the alarm system sensitivity. Its
value is the level at which an alarm would be triggered, and its
selection may be performed with the aid of ROC analysis. The
main utility of using two distinct levels, however, is to enable
the decoupling of alarm design from construction of the critical
event itself. Two levels are also often used in practice for the
design of fault detection algorithms that involve limit-based
abort decisions. A “yellow-line” limit check is often used as
a precursor caution and warning threshold to the “redline”
abort threshold. The former can be used as an alarm system
design parameter, whereas the latter may serve as a hard limit
determined a priori via extensive experimental validation.

To recap, the redline and predictive techniques both use
fixed thresholds, $L_A$, and the optimal level-crossing predictor
uses an optimal alarm condition (or approximations of it).
All three techniques are leveraged to predict another distinctly 
more critical level-crossing event (based upon the critical level, 
L), and all are preferable to the use of a single level for a 
number of reasons. For one, ROC curve statistics (the true and 
false positive rates) can be expressed directly as a function of 
the model parameters when using these techniques. Therefore, 
design of the alarm system can proceed without the need to 
observe actual examples of failures, and there is no need to 
estimate the alarm system metrics empirically. This obviates 
the need to rely upon having actual available examples of 
failures for alarm system design to generate the ROC curve. 

It is not possible to construct an ROC curve as a function 
of model parameters when using a single level. In this case 
the ROC curve statistics can only be estimated empirically 
with observational and truth data. Truth data in this case 
can either be represented by model generated level-crossing 
events, or failures generated from a complex system. The 
construction of an ROC curve in this manner can be used for 
any alarm system technique. However, in the absence of actual 
observations of failures, the “Monte-Carlo” style method of 
generating truth data can be computationally intensive, and 
is still based purely upon simulated model-generated level- 
crossings. As such, it is imperative that the gap between 
model-generated failures and actual observations of failures 
be made as small as possible. The level-crossing event must 
sufficiently characterize an actual physical failure to realize 
the advantage of expressing the ROC curve as a function
of the model parameters, and thus to design an alarm system 
without the need to observe actual examples of failures. 

The redline, predictive, and optimal techniques are prefer- 
able to the use of a single level for another reason. The former 
three techniques generate ROC curve statistics that are based 
upon the use of distinct design spaces for construction of a 
critical event and their respective alarm systems, providing a 
measure of functional distinction. The critical event can be 
constructed such that multiple level-crossings span multiple 
time steps into the future, implicitly enabling a predictive 
assessment capability for alarm system design. Using a single 
level-based alarm system merges the functionality of limit 
checking and the use of an alarm design parameter. As such 
it is not possible to decouple independent alarm design from 
the critical event, and thus this method provides no measure of 
functional distinction. It is also the one most commonly found
in the literature, e.g., [15], [8], [10]. Arguably, the critical event
should be constructed to emulate the physics of the failure, and
the alarm system should independently be designed to predict
it. The distinction between these two paradigms is one of the
most discernible differences between the theoretical techniques used
here and in other literature [1], [2], [3], [4], [5], [6].

III. Methodology 

A level-crossing event is defined with a critical level, $L$,
that is assumed to have a fixed, static value. The level is
exceeded by some critical parameter that can be represented
by a dynamic process, and is often modeled as a zero-mean
stationary linear dynamic system driven by Gaussian noise.
Most of the theory that follows is based upon this standard
representation of the optimal level-crossing problem. As such,
our underlying assumption is that we can fit measured or
transformed data to a model represented by a linear dynamic
system driven by Gaussian noise. The state-space formulation
is shown in Eqns. 1-3, demonstrating propagation of both
the state, $x_k \in \mathbb{R}^n$, which is corrupted by process noise
$w_k \in \mathbb{R}^n$, and the state covariance matrix, $P_k$, which evolve
with the time-invariant system matrix $A$. The output, $y_k \in \mathbb{R}$,
is univariate, and is corrupted by measurement noise $v_k \in \mathbb{R}$.



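$$x_{k+1} = A x_k + w_k \tag{1}$$

$$y_k = C x_k + v_k \tag{2}$$

$$P_{k+1} = A P_k A^T + Q \tag{3}$$

where

$$w_k \sim \mathcal{N}(0, Q), \quad Q \succeq 0$$

$$v_k \sim \mathcal{N}(0, R), \quad R > 0$$

$$x_0 \sim \mathcal{N}(\mu_x, P_0)$$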
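
The state-space model of Eqns. 1-3 can be exercised with a minimal sketch. It assumes a scalar state ($n = 1$), so that $A$, $C$, $Q$, $R$ and $P_k$ reduce to the numbers `a`, `c`, `q`, `r` and `p`; the function names and parameter values are illustrative assumptions and are not taken from the article.

```python
import random


def simulate(a, c, q, r, p0, n_steps, seed=0):
    """Simulate x_{k+1} = a x_k + w_k (Eqn. 1) and y_k = c x_k + v_k (Eqn. 2).

    Scalar, zero-mean case: x_0 ~ N(0, p0), w_k ~ N(0, q), v_k ~ N(0, r).
    """
    rng = random.Random(seed)
    x = rng.gauss(0.0, p0 ** 0.5)
    ys = []
    for _ in range(n_steps):
        ys.append(c * x + rng.gauss(0.0, r ** 0.5))  # measurement, Eqn. 2
        x = a * x + rng.gauss(0.0, q ** 0.5)         # state update, Eqn. 1
    return ys


def lyapunov_steady_state(a, q, p0, tol=1e-12, max_iter=10_000):
    """Iterate P_{k+1} = a P_k a + q (Eqn. 3) until successive values agree."""
    p = p0
    for _ in range(max_iter):
        p_next = a * p * a + q
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("covariance recursion did not converge; is |a| < 1?")


# Illustrative (assumed) parameters for a stable first-order system.
a, c, q, r = 0.9, 1.0, 0.19, 0.1
p_ss = lyapunov_steady_state(a, q, p0=5.0)
ys = simulate(a, c, q, r, p0=p_ss, n_steps=200)
```

Starting the simulation at `p0 = p_ss` makes the simulated process stationary from the first sample, which is the regime assumed in the remainder of the analysis.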


A summary of the notation to be used henceforth is provided
in Table I. As mentioned previously, there is great flexibility
in constructing a mathematical representation for the
level-crossing event, $C_k$. Ostensibly, the target application will drive
the definition of this event. As such, in this paper the event
of interest is shown in Eqn. 4, cf. Kerr [8], in consideration
of the motivating factors described in the introduction. This
level-crossing event represents at least one exceedance outside
of the threshold envelope specified by $[-L, L]$ of the process
$y_k$ within the specified look-ahead prediction window, $d$.


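$$C_k = \bigcup_{j=1}^{d} S_{k+j} = \bigcup_{j=1}^{d} E'_{k+j} = \Omega \setminus \bigcap_{j=1}^{d} E_{k+j} \tag{4}$$

where

$$E_{k+j} \triangleq \{|y_{k+j}| < L\}, \quad \forall j \geq 1$$

$$S_{k+j} \triangleq \begin{cases} E'_{k+j} & j = 1 \\ \bigcap_{i=1}^{j-1} E_{k+i},\; E'_{k+j} & \forall j > 1 \end{cases}$$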

Fig. 1 illustrates the relationship between subevents $S_{k+j}$
and $E_{k+j}$ when $d = 5$. The event $C_k$ can be represented
as the union of disjoint subevents, $S_{k+j}$, or as the union of
overlapping subevents, $E'_{k+j}$. However, due to De Morgan's
theorem, the latter can be expressed in a more compact fashion
via a single term when computing the probability of the overall
event. This obviates the need for the inclusion/exclusion
rule, which would otherwise be required to realize all relevant
terms in a probability computation based upon the union of
overlapping subevents, $E'_{k+j}$, and for which the number of
terms would be exponential in $d$. It also obviates the need for
computing the probability based upon the former union of disjoint
subevents, $S_{k+j}$. The inclusion/exclusion rule is not needed
in that case, but the number of terms would still be linear in $d$,
as the probability of the union of disjoint subevents is the sum
of terms involving $S_{k+j}$. Thus Eqn. 5 represents
the unconditional probability of the level-crossing event in its
most compact representational form.

TABLE I
Summary of mathematical notation

Mathematical Representation    Nomenclature
$\mu_{\bullet}$                $E[\bullet]$ (Expectation)
$\hat{\bullet}_{k+j|k}$        $E[\bullet_{k+j} \mid y_0, \ldots, y_k]$ (Conditional Expectation)
$\tilde{\bullet}_k$            Orthonormal rotation of $\bullet_k$ in vector space
$\bullet^*$                    Result of vector space orthonormal rotation in probability or event space
$P_{k+i,k+j}$                  State autocovariance matrix
$P^L_{ss}$                     Solution to Discrete Algebraic Lyapunov Equation
$P^R_{ss}$                     Solution to Discrete Algebraic Riccati Equation (A priori steady-state error covariance matrix)
$\hat{P}^R_{ss}$               (A posteriori steady-state error covariance matrix)
$F_{k+1|k}$                    Kalman Gain
$F_{ss}$                       Steady-State Kalman Gain
$V_{k+j|k}$                    Conditional prediction variance for future output value
$C_k$                          Level-crossing event
$S_{k+j}$                      Level-crossing subevent (disjoint)
$E_{k+j}$                      Level-crossing subevent (non-disjoint)
$\Omega$                       Universe of all events
$A_k$                          Optimal alarm event (sublevel set)
$\Omega_C$                     Region in vector space spanned by level-crossing event
$L_A$                          Level set for optimal alarm event or design threshold for “redline” and “predictive” methods
$\Omega_{A_j}$                 Sublevel set for subevent (used in root-finding approximation to optimal alarm event)
$A_k^j$                        Sublevel set for subevent (used in closed-form approximation to optimal alarm event)
$A_k^{i,j}$                    Sublevel set for decomposed subevent (used in closed-form approximation to optimal alarm event)
$L_{A_j}$                      Level set for subevent (used in (either) approximation to optimal alarm event)
$L$                            Critical level
$d$                            Prediction horizon
$P_b$                          Border Probability
$P_{b_{crit}}$                 Critical Border Probability (Domain Boundary)
$P_d$                          Detection Probability
$P_{fa}$                       False Alarm Probability


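$$P(C_k) = 1 - \int_{-L}^{L} \cdots \int_{-L}^{L} \mathcal{N}(y_d;\, \mu_{y_d}, \Sigma_{y_d})\, dy_d \tag{5}$$

where

$$y_d \triangleq \begin{bmatrix} y_{k+1} \\ \vdots \\ y_{k+d} \end{bmatrix}, \qquad \mu_{y_d} = 0_d \triangleq \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$$

$$\Sigma_{y_d} \triangleq \begin{cases} C P_{k+i} C^T + R & \forall i = j \in [1, \ldots, d] \\ C P_{k+i,k+j} C^T & \forall j > i \in [1, \ldots, d] \end{cases}$$

and $P_{k+i,k+j} = A^{j} (P_k - P^L_{ss}) (A^T)^{i} + A^{j-i} P^L_{ss}$.

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-XX, NO. X, XXX 2010

[Fig. 1. Level-Crossing Event Realization. Process value versus k (samples), with the critical level, predicted values, false alarms, correct detections, and missed detections marked. Annotated subevents for d = 5: $E_{k+1} = 1$, $E_{k+2} = 1$, $E_{k+5} = 0$, $S_{k+5} = 1$.]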

We may approximate $\Sigma_{y_d}$ as shown in Eqn. 6 by substituting
the steady-state solution of the Lyapunov equation given
previously as Eqn. 3, $P^L_{ss}$, in place of $P_k$, which agrees with
our assumption of stationarity.

$$\Sigma_{y_d} \approx \begin{cases} C P^L_{ss} C^T + R & \forall i = j \in [1, \ldots, d] \\ C A^{j-i} P^L_{ss} C^T & \forall j > i \in [1, \ldots, d] \end{cases} \tag{6}$$

Although this approximation introduces error in
the probability of a level-crossing event, $P(C_k)$, at a specific
point in time, $k$, the error is ostensibly negligible, and the
approximation provides a great computational advantage in the
design of all alarm systems that are based upon it. Instead of
designing an optimal
alarm system for each time step, we design a single alarm
system based upon the limiting statistics that are reached at
steady-state, greatly reducing the computational burden. The
steady-state assumption has not been used in work by Antunes
et al. [13], but forgoing it incurs much greater computational
effort.
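
To make the steady-state computation of $P(C_k)$ concrete, the following sketch evaluates Eqn. 5 with the covariance of Eqn. 6 for the special case $d = 2$ and a scalar state. The helper `box_prob_2d` is an assumption of this sketch: it integrates one coordinate with Simpson's rule and treats the other exactly through its conditional normal distribution, standing in for the general multivariate integration; all parameter values are illustrative.

```python
from statistics import NormalDist


def box_prob_2d(mean, cov, L, n=400):
    """P(|Y1| < L, |Y2| < L) for a bivariate normal (n must be even).

    Y1 is integrated over [-L, L] with Simpson's rule; Y2 is handled in
    closed form via its conditional distribution given Y1.
    """
    m1, m2 = mean
    (s11, s12), (_, s22) = cov
    marginal = NormalDist(m1, s11 ** 0.5)
    cond_sd = (s22 - s12 * s12 / s11) ** 0.5
    std = NormalDist()

    def integrand(y1):
        cond_mean = m2 + s12 / s11 * (y1 - m1)
        upper = std.cdf((L - cond_mean) / cond_sd)
        lower = std.cdf((-L - cond_mean) / cond_sd)
        return marginal.pdf(y1) * (upper - lower)

    h = 2.0 * L / n
    total = integrand(-L) + integrand(L)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(-L + i * h)
    return total * h / 3.0


# Illustrative scalar system (assumed values): A = a, C = c, Q = q, R = r.
a, c, q, r, L = 0.9, 1.0, 0.19, 0.1, 2.0
p_ss = q / (1.0 - a * a)     # scalar solution of the steady-state Lyapunov equation
var = c * p_ss * c + r       # diagonal entries of Eqn. 6
cross = c * a * p_ss * c     # off-diagonal entry of Eqn. 6 with j - i = 1
sigma = ((var, cross), (cross, var))

p_event = 1.0 - box_prob_2d((0.0, 0.0), sigma, L)   # Eqn. 5
```

Because `sigma` does not depend on `k`, `p_event` is computed once and reused at every time step, which is the computational saving the steady-state assumption buys.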


Theorem 1, which can be found in Appendix I, provides the
mathematical underpinnings for the optimal alarm condition
corresponding to the level-crossing event, shown here as
Eqn. 7. Alternatively, the optimal alarm condition derived in
Theorem 1 can be expressed in terms of the subevents $E_{k+j}$,
as shown in Eqn. 8.

$$P(C_k \mid y_0, \ldots, y_k) \geq P_b \tag{7}$$

$$\Leftrightarrow \; P\Big(\bigcap_{j=1}^{d} E_{k+j} \,\Big|\, y_0, \ldots, y_k\Big) \leq 1 - P_b \tag{8}$$


The optimal alarm condition has therefore been derived
from the use of the likelihood ratio, resulting in the conditional
inequality given in Eqn. 7. This basically says “give alarm
when the conditional probability of the event, $C_k$, exceeds the
level $P_b$.” Here, $P_b$ represents some optimally chosen border or
threshold probability with respect to a relevant alarm system
metric. It is necessary to find the alarm regions in order to
design the alarm system. This alarm region is parameterized
by future process output predictions and covariances, which
can be derived from standard Kalman filter Eqns. 9-13.

$$\hat{y}_{k|k} = C \hat{x}_{k|k} \tag{9}$$

$$\hat{x}_{k+1|k} = A \hat{x}_{k|k} \tag{10}$$

$$F_{k+1|k} \triangleq P_{k+1|k} C^T (C P_{k+1|k} C^T + R)^{-1} \tag{11}$$

$$P_{k+1|k} = A P_{k|k} A^T + Q \tag{12}$$

$$P_{k+1|k+1} = P_{k+1|k} - F_{k+1|k} C P_{k+1|k} \tag{13}$$

where

$$\hat{x}_{k|k} \triangleq E[x_k \mid y_0, \ldots, y_k]$$

$$P_{k|k} \triangleq E[(x_k - \hat{x}_{k|k})(x_k - \hat{x}_{k|k})^T \mid y_0, \ldots, y_k]$$

Relevant predictions, covariances and cross-covariances are
given below as Eqns. 14-18, respectively.

$$\hat{y}_{k+j|k} = C A^{j} \hat{x}_{k|k} \tag{14}$$

$$P_{k+j|k} = A^{j} (P_{k|k} - P^L_{ss}) (A^T)^{j} + P^L_{ss} \tag{15}$$

$$\approx A^{j} (\hat{P}^R_{ss} - P^L_{ss}) (A^T)^{j} + P^L_{ss} \tag{16}$$

$$P_{k+i,k+j|k} = A^{j} (P_{k|k} - P^L_{ss}) (A^T)^{i} + A^{j-i} P^L_{ss} \tag{17}$$

$$\approx A^{j} (\hat{P}^R_{ss} - P^L_{ss}) (A^T)^{i} + A^{j-i} P^L_{ss} \tag{18}$$

$$\hat{P}^R_{ss} = P^R_{ss} - F_{ss} C P^R_{ss} \tag{19}$$

$$F_{ss} = P^R_{ss} C^T (C P^R_{ss} C^T + R)^{-1} \tag{20}$$

$P^R_{ss}$ is the solution to the combined steady-state version of
Eqns. 12 and 13 given previously, or the discrete algebraic Riccati
equation, and $\hat{P}^R_{ss}$ is the steady-state a posteriori
covariance matrix given in Eqn. 19. Eqn. 20, which is the
steady-state version of the Kalman gain from Eqn. 11, is also used
in Eqn. 19.
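
A minimal sketch of the steady-state quantities follows, again assuming a scalar state so that the Riccati recursion of Eqns. 11-13 can be iterated with ordinary arithmetic. The function names and numerical values are assumptions of this sketch rather than the article's implementation.

```python
def riccati_steady_state(a, c, q, r, tol=1e-13, max_iter=10_000):
    """Iterate Eqns. 11-13 for a scalar state until the covariance settles.

    Returns (a priori covariance, a posteriori covariance, Kalman gain),
    i.e. scalar versions of P^R_ss, the a posteriori P^R_ss and F_ss.
    """
    p_post = q
    for _ in range(max_iter):
        p_prior = a * p_post * a + q                  # Eqn. 12
        gain = p_prior * c / (c * p_prior * c + r)    # Eqn. 11
        p_post_next = p_prior - gain * c * p_prior    # Eqn. 13
        if abs(p_post_next - p_post) < tol:
            return p_prior, p_post_next, gain
        p_post = p_post_next
    raise RuntimeError("Riccati recursion did not converge")


def predict(a, c, r, x_filt, p_post_ss, p_lyap_ss, j):
    """j-step output prediction (Eqn. 14) and its variance via Eqn. 16."""
    y_hat = c * a ** j * x_filt
    p_pred = a ** j * (p_post_ss - p_lyap_ss) * a ** j + p_lyap_ss
    return y_hat, c * p_pred * c + r                  # V_{k+j|k}


# Illustrative (assumed) parameters.
a, c, q, r = 0.9, 1.0, 0.19, 0.1
p_prior, p_post, gain = riccati_steady_state(a, c, q, r)
p_lyap = q / (1.0 - a * a)
y_hat_3, v_3 = predict(a, c, r, x_filt=1.5, p_post_ss=p_post, p_lyap_ss=p_lyap, j=3)
```

As `j` grows, the predicted variance approaches the unconditional output variance `c * p_lyap * c + r`, which reflects the fading influence of the measurements on distant predictions.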

The approximations shown in Eqns. 16 and 18 will provide
for a great computational advantage in design of the optimal
alarm system and its corresponding approximations for reasons
stated previously. Due to the approximation of $P_{k|k}$ with $\hat{P}^R_{ss}$
shown in these equations, the Kalman filter will be suboptimal,
as cited by Lewis [16]. However, the assumption of stationarity
is required for the design of an optimal alarm system as defined
by Theorem 1, and holds here as well.

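A more formal representation of the optimal alarm region
is shown in Eqn. 21, which essentially defines a sublevel set
of $g(\hat{y}_d) = P(\bigcap_{j=1}^{d} E_{k+j} \mid y_0, \ldots, y_k)$ as a function of $\hat{y}_d$.

$$A_k = \Big\{\bigcap_{i=1}^{d} \hat{y}_{k+i|k} : P(C_k \mid y_0, \ldots, y_k) \geq P_b\Big\} \tag{21}$$

$$= \Big\{\bigcap_{i=1}^{d} \hat{y}_{k+i|k} : P\Big(\bigcap_{j=1}^{d} E_{k+j} \,\Big|\, y_0, \ldots, y_k\Big) \leq 1 - P_b\Big\}$$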

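Eqns. 22-23 give the multivariate normal probability
computation to be performed via numerical integration, required
for enabling the optimal alarm condition.

$$P\Big(\bigcap_{j=1}^{d} E_{k+j} \,\Big|\, y_0, \ldots, y_k\Big) = \int_{-L}^{L} \cdots \int_{-L}^{L} \mathcal{N}(y_d;\, \hat{y}_d, \hat{\Sigma}_{y_d})\, dy_d \tag{22}$$

$$= \int_{-L - \hat{y}_{k+1|k}}^{L - \hat{y}_{k+1|k}} \cdots \int_{-L - \hat{y}_{k+d|k}}^{L - \hat{y}_{k+d|k}} \mathcal{N}(y_d;\, 0_d, \hat{\Sigma}_{y_d})\, dy_d \tag{23}$$

where

$$\hat{y}_d \triangleq E[y_d \mid y_0, \ldots, y_k] = \begin{bmatrix} \hat{y}_{k+1|k} \\ \vdots \\ \hat{y}_{k+d|k} \end{bmatrix}$$

$$\hat{\Sigma}_{y_d} \triangleq \begin{cases} V_{k+i|k} & \forall i = j \in [1, \ldots, d] \\ C P_{k+i,k+j|k} C^T & \forall j > i \in [1, \ldots, d] \end{cases}$$

$$V_{k+i|k} \triangleq C P_{k+i|k} C^T + R$$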
The feasible region for values of $P_b$ can easily be
determined by applying the intermediate value theorem from
calculus, which provides sufficient conditions for finding a
level set solution. The sufficient conditions are shown as Eqns.
24-25, and the resulting level set is shown as Eqn. 26.

$$g(0_d) \geq 1 - P_b \tag{24}$$

$$\lim_{|\hat{y}_d \setminus \hat{y}_{k+j|k}| \to \infty} g(\hat{y}_d) \leq 1 - P_b, \quad \forall j \in [1, \ldots, d] \tag{25}$$

$$L_A = \Big\{\bigcap_{j=1}^{d} \hat{y}_{k+j|k} : g(\hat{y}_d) = 1 - P_b\Big\} \tag{26}$$

The notation that represents the limiting condition shown in
Eqn. 25 is $|\hat{y}_d \setminus \hat{y}_{k+j|k}| \to \infty$, and is meant to indicate that all
elements of $\hat{y}_d$ other than $\hat{y}_{k+j|k}$ approach $\pm\infty$. Application
of this condition yields $P_b \leq 1$, which is true by definition,
and application of the sufficient condition shown in Eqn. 24
yields $P_b \geq 1 - g(0_d)$. Thus the feasible region for $P_b$ is
$P_b \in [1 - g(0_d), 1]$.
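
The optimal alarm condition and the lower end of the feasible region for $P_b$ can be sketched for $d = 2$ and a scalar state. Here `box_prob_2d` is an assumed stand-in for the general multivariate normal integration of Eqn. 22, the conditional covariance is built from Eqns. 16 and 18, and all numerical values are illustrative.

```python
from statistics import NormalDist


def box_prob_2d(mean, cov, L, n=400):
    """P(|Y1| < L, |Y2| < L) for a bivariate normal via Simpson's rule."""
    m1, m2 = mean
    (s11, s12), (_, s22) = cov
    marginal = NormalDist(m1, s11 ** 0.5)
    cond_sd = (s22 - s12 * s12 / s11) ** 0.5
    std = NormalDist()

    def integrand(y1):
        cond_mean = m2 + s12 / s11 * (y1 - m1)
        return marginal.pdf(y1) * (std.cdf((L - cond_mean) / cond_sd)
                                   - std.cdf((-L - cond_mean) / cond_sd))

    h = 2.0 * L / n
    total = integrand(-L) + integrand(L)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(-L + i * h)
    return total * h / 3.0


def conditional_cov(a, c, q, r, iters=500):
    """Steady-state conditional covariance of (y_{k+1}, y_{k+2}), scalar state."""
    p_lyap = q / (1.0 - a * a)
    p_post = p_lyap
    for _ in range(iters):                       # Riccati recursion, Eqns. 11-13
        p_prior = a * p_post * a + q
        p_post = p_prior - (p_prior * c) ** 2 / (c * p_prior * c + r)

    def p_cross(i, j):                           # Eqn. 18 (Eqn. 16 when i == j)
        return a ** j * (p_post - p_lyap) * a ** i + a ** (j - i) * p_lyap

    v1 = c * p_cross(1, 1) * c + r
    v2 = c * p_cross(2, 2) * c + r
    v12 = c * p_cross(1, 2) * c
    return ((v1, v12), (v12, v2))


def g(y_hat, cov, L):
    """g(y_hat) = P(E_{k+1}, E_{k+2} | y_0, ..., y_k), Eqn. 22 with d = 2."""
    return box_prob_2d(y_hat, cov, L)


def alarm(y_hat, cov, L, p_b):
    """Optimal alarm condition in the form of Eqn. 8."""
    return g(y_hat, cov, L) <= 1.0 - p_b


cov = conditional_cov(a=0.9, c=1.0, q=0.19, r=0.1)
p_b_min = 1.0 - g((0.0, 0.0), cov, L=2.0)   # lower end of the feasible region
```

Any design value `p_b` below `p_b_min` places the origin, and therefore every prediction, inside the alarm region, so the alarm would be raised at all times.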

It is not possible to obtain a closed-form representation of
the parametrization for the optimal alarm region shown in Eqn.
21. As such, the resulting ROC curve statistics cannot be
computed analytically by means of numerical integration, as will be
shown to be possible for other methods. As an alternative, we
must use the Monte Carlo style approach discussed previously.
This will allow for the ROC curve statistics to be estimated
empirically with observational and truth data generated from
the existing model and corresponding simulations of
level-crossing events.

However, as will be shown, with the aid of two distinct 
approximations we can generate ROC curve statistics by 
numerically integrating expressions for the computation of 
relevant multivariate normal probabilities. These multivariate 
probability computations are performed by using an adaptation 
of Genz’s algorithm [17], which is based upon a robust and 
computationally efficient technique designed to be used for 
integrations in multiple dimensions for multivariate normal 
distributions. This provides a tool necessary for the design 
of approximations to an optimal alarm system, and also other 
failure detection algorithms such as the one most often used by 
Kerr [18], who specifically cites issues with the computation 
of these types of integrals. As such, we can avoid the
often very time-consuming and computationally intensive simulation runs
required by Monte-Carlo style empirical estimation.

A. Root-finding Approximation

The optimal alarm region, $A_k$, can be approximated by
the alarm region specified by $\bigcup_{j=1}^{d} \Omega_{A_j}$. Fundamentally, the
approximation is constructed by solving for asymptotic bounds
on the exact alarm region. By using asymptotes, we are
implicitly making a geometrical approximation by forming
a hyperbox around the alarm region. Simple 2-dimensional
examples of such hyperboxes for various values of $L$ and
$P_b$ are shown in Fig. 2. There is visual evidence that limiting
effects for this approximation exist as both $L$ and $P_b$ approach
the extremities of their feasible ranges. These effects will be
touched on briefly later in the results section, but will be
investigated in earnest in a sequel article.


[Fig. 2. Root-finding approximations for optimal alarm region (at least one exceedance in 2 steps). Panels: alarm regions for L = 2, L = 3, L = 4, and L = 5.]

Mathematically, the approximation is formed by solving a
root-finding problem which yields bounding asymptotes. The
root-finding problem is posed by first taking the limit as each
dimension of Eqn. 21 approaches 0, other than the one for
which the asymptote is being derived. Eqn. 27 expresses this
limiting condition as a function of the dimension of interest.

$$f(\hat{y}_{k+j|k}) = \lim_{\hat{y}_d \setminus \hat{y}_{k+j|k} \to 0} P\Big(\bigcap_{j=1}^{d} E_{k+j} \,\Big|\, y_0, \ldots, y_k\Big) \tag{27}$$

Having defined $f(\hat{y}_{k+j|k})$, it is now possible to express $\Omega_{A_j}$
by Eqns. 28-29.

$$\Omega_{A_j} \triangleq \{\hat{y}_{k+j|k} : f(\hat{y}_{k+j|k}) \leq 1 - P_b\} \tag{28}$$

$$= \{|\hat{y}_{k+j|k}| \geq L_{A_j}\} \tag{29}$$

where the root-finding problem is given by numerically
solving Eqn. 30.

$$L_{A_j} \triangleq \{|\hat{y}_{k+j|k}| : f(\hat{y}_{k+j|k}) = 1 - P_b\} \tag{30}$$

Thus the root-finding approximation to the optimal alarm
region is given by $\bigcup_{j=1}^{d} \Omega_{A_j} \approx A_k$. Note that the function $f$
incorporates all elements of the covariance matrix $\hat{\Sigma}_{y_d}$ when
computing the asymptotes, just as when constructing the
sublevel set for the exact optimal alarm region. Furthermore,
the feasible region for $P_b$ is identical to that of the
exact optimal alarm region, $P_b \in [1 - g(0_d), 1] = [1 - f(0), 1]$,
by using a similar argument and set of sufficient conditions,
as shown in Eqns. 31-32 below.

$$f(0) \geq 1 - P_b \tag{31}$$

$$\lim_{|\hat{y}_{k+j|k}| \to \infty} f(\hat{y}_{k+j|k}) \leq 1 - P_b \tag{32}$$
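
The root-finding step of Eqn. 30 can be sketched with plain bisection for $d = 2$. The conditional covariance `cov` is a hypothetical matrix chosen for illustration, `box_prob_2d` is an assumed stand-in for the multivariate normal integration, and bisection is used here only as one simple root finder; the article does not specify which solver was used.

```python
from statistics import NormalDist


def box_prob_2d(mean, cov, L, n=400):
    """P(|Y1| < L, |Y2| < L) for a bivariate normal via Simpson's rule."""
    m1, m2 = mean
    (s11, s12), (_, s22) = cov
    marginal = NormalDist(m1, s11 ** 0.5)
    cond_sd = (s22 - s12 * s12 / s11) ** 0.5
    std = NormalDist()

    def integrand(y1):
        cond_mean = m2 + s12 / s11 * (y1 - m1)
        return marginal.pdf(y1) * (std.cdf((L - cond_mean) / cond_sd)
                                   - std.cdf((-L - cond_mean) / cond_sd))

    h = 2.0 * L / n
    total = integrand(-L) + integrand(L)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(-L + i * h)
    return total * h / 3.0


def f(y, j, cov, L):
    """Eqn. 27 for d = 2: all predictions set to 0 except dimension j (0 or 1)."""
    mean = [0.0, 0.0]
    mean[j] = y
    return box_prob_2d(tuple(mean), cov, L)


def solve_threshold(j, cov, L, p_b, hi=50.0, tol=1e-8):
    """Bisection for L_{A_j} such that f(L_{A_j}) = 1 - p_b (Eqn. 30)."""
    target = 1.0 - p_b
    if f(0.0, j, cov, L) < target:
        raise ValueError("p_b lies below the feasible region (Eqn. 31)")
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid, j, cov, L) > target:
            lo = mid      # still inside the no-alarm region
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical conditional covariance of (y_{k+1}, y_{k+2}) given the data.
cov = ((0.35, 0.22), (0.22, 0.49))
L_A1 = solve_threshold(0, cov, L=2.0, p_b=0.5)
L_A2 = solve_threshold(1, cov, L=2.0, p_b=0.5)
```

The alarm is then raised whenever `abs(y_hat_1) >= L_A1` or `abs(y_hat_2) >= L_A2`, which is the hyperbox form of Eqn. 29.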

However, there is one primary difference between this
approximation and the exact alarm region. As far as the
conditional mean, $\hat{y}_d$, is concerned, the asymptotic approximation
is parameterized only by the corresponding dimension of the
conditional mean, $\hat{y}_{k+j|k}$. The exact optimal alarm region uses
all dimensions of the distribution, and thus of the conditional
mean, $\hat{y}_d$, simultaneously.

It is possible to generate formulae for the true and false
positive rates as a function of $L_A$ by appealing to Eqns. 33-34,
where in place of $A_k$ its approximation, $\bigcup_{j=1}^{d} \Omega_{A_j}$, may be
used.

True positive rate:

$$P_d = P(C_k \mid A_k) = \frac{P(C_k, A_k)}{P(A_k)} \tag{33}$$

False positive rate:

$$P_{fa} = P(A_k \mid C'_k) = \frac{P(C'_k, A_k)}{P(C'_k)} = \frac{P(A_k) - P(C_k, A_k)}{1 - P(C_k)} \tag{34}$$
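
Eqns. 33-34 translate directly into code. The sketch below computes one ROC point either from the three probabilities or, as in the Monte Carlo style approach, from paired sequences of event and alarm indicators; the function names and the sample sequences are hypothetical.

```python
def roc_from_probabilities(p_c, p_a, p_ca):
    """One ROC point from P(C_k), P(A_k) and P(C_k, A_k)."""
    p_d = p_ca / p_a                      # Eqn. 33: P(C_k | A_k)
    p_fa = (p_a - p_ca) / (1.0 - p_c)     # Eqn. 34: P(A_k | C'_k)
    return p_d, p_fa


def roc_from_samples(events, alarms):
    """Empirical version: events[i] and alarms[i] are booleans for trial i."""
    n = len(events)
    p_c = sum(events) / n
    p_a = sum(alarms) / n
    p_ca = sum(1 for e, a in zip(events, alarms) if e and a) / n
    return roc_from_probabilities(p_c, p_a, p_ca)


# Hypothetical indicator sequences for eight trials.
events = [True, True, False, False, False, True, False, False]
alarms = [True, False, False, True, False, True, False, False]
p_d, p_fa = roc_from_samples(events, alarms)
```

Sweeping the design parameter and recomputing the point at each value traces out the ROC curve.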


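Because we have already introduced the formula for $P(C_k)$
in Eqn. 5, which holds regardless of the alarm system being
used, we must only find expressions for $P(C_k, A_k)$ and
$P(A_k)$. They are given in Eqns. 35-36, where
$P_{b_{crit}} = 1 - g(0_d) = 1 - f(0)$, and they are also implicitly
expressed as a function of the design parameter, $P_b$, as a
consequence of Eqn. 30. Note also that the off-diagonal blocks of
the covariance matrix $\Sigma_z$ are equivalent to $\Sigma_{\hat{y}_d}$ as a
consequence of the projection theorem.

$$P(A_k) = \begin{cases} 1 - P(A'_k) & P_b \geq P_{b_{crit}} \\ 1 & P_b < P_{b_{crit}} \end{cases} = \begin{cases} 1 - P\big(\bigcap_{j=1}^{d} \Omega'_{A_j}\big) & P_b \geq P_{b_{crit}} \\ 1 & P_b < P_{b_{crit}} \end{cases} \tag{35}$$

$$P(C_k, A_k) = \begin{cases} P(C_k) - P(A'_k) + P(C'_k, A'_k) & P_b \geq P_{b_{crit}} \\ P(C_k) & P_b < P_{b_{crit}} \end{cases} \tag{36}$$

where

$$P(A'_k) = P\Big(\bigcap_{j=1}^{d} \Omega'_{A_j}\Big) = P\Big(\bigcap_{j=1}^{d} |\hat{y}_{k+j|k}| < L_{A_j}\Big) = \int_{-L_{A_1}}^{L_{A_1}} \cdots \int_{-L_{A_d}}^{L_{A_d}} \mathcal{N}(\hat{y}_d;\, \mu_{\hat{y}_d}, \Sigma_{\hat{y}_d})\, d\hat{y}_d$$

and

$$\Sigma_{\hat{y}_d} = \Sigma_{y_d} - \hat{\Sigma}_{y_d} = \mathcal{O} (P^L_{ss} - \hat{P}^R_{ss}) \mathcal{O}^T, \qquad \mathcal{O} \triangleq \begin{bmatrix} CA \\ \vdots \\ CA^{d} \end{bmatrix}$$

Furthermore,

$$P(C'_k, A'_k) = P\Big(\bigcap_{j=1}^{d} E_{k+j},\, \bigcap_{j=1}^{d} \Omega'_{A_j}\Big) = \int_{-L}^{L} \cdots \int_{-L}^{L} \int_{-L_{A_1}}^{L_{A_1}} \cdots \int_{-L_{A_d}}^{L_{A_d}} \mathcal{N}(z;\, \mu_z, \Sigma_z)\, dz$$

where

$$z \triangleq \begin{bmatrix} y_d \\ \hat{y}_d \end{bmatrix}, \qquad \mu_z \triangleq \begin{bmatrix} \mu_{y_d} \\ \mu_{\hat{y}_d} \end{bmatrix}, \qquad \Sigma_z \triangleq \begin{bmatrix} \Sigma_{y_d} & \Sigma_{\hat{y}_d} \\ \Sigma_{\hat{y}_d} & \Sigma_{\hat{y}_d} \end{bmatrix}$$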


B. Closed-form Approximation 

The optimal alarm region, $A_k$, can also be approximated by
an alarm region specified by $\bigcup_{j=1}^{d} A_k^j$, with a successive
approximation on $A_k^j$; $A_k^j$ is defined in Eqn. 37. Fundamentally,
the approximation can be constructed in the same fashion as
the root-finding method, by solving for asymptotic bounds on
the exact alarm region.

$$A_k^j \triangleq \{\hat{y}_{k+j|k} : P(E_{k+j} \mid y_0, \ldots, y_k) \leq 1 - P_b\} \tag{37}$$

A containment relationship between the exact optimal alarm
region and the union of inequalities, $\bigcup_{j=1}^{d} A_k^j \subseteq A_k$, can easily
be shown with a linear transformation of the conditionally
defined Gaussian vector $y_d$ to a vector of independent
variables. The integrand of Eqn. 23 is a multivariate Gaussian
density whose conditional covariance matrix is given by $\hat{\Sigma}_{y_d}$.
The orthonormal decomposition of this covariance matrix and
density of the corresponding transformed vector $\tilde{y}_d$ are shown
in Eqns. 38-40.

$$y_d = \Lambda \tilde{y}_d \tag{38}$$

$$\hat{\Sigma}_{y_d} = \Lambda \Gamma \Lambda^T \tag{39}$$

$$p(\tilde{y}_d) = \mathcal{N}(\tilde{y}_d;\, 0_d, \Gamma) \tag{40}$$

Here, the elements of $\tilde{y}_d$ are independent, and thus $\Gamma$ is
diagonal. As such, geometric containment easily follows when
considering a revised expression for $A_k$ and $\bigcup_{j=1}^{d} A_k^j$. Thus,
the latter approximation to the exact alarm region can be
rewritten in the transformed probability space as shown in
Eqn. 41. The superscript * for all probabilities included in
this expression refers to the transformed values that result
after the orthonormal rotation. Note that this expression does
not change significantly from what was given in Eqn. 37.

$$\bigcup_{j=1}^{d} A_k^j = \bigcup_{j=1}^{d} \{\hat{y}^*_{k+j|k} : P(E^*_{k+j} \mid y_0, \ldots, y_k) \leq 1 - P_b^*\} \tag{41}$$

The exact alarm region $A_k$ can be rewritten in the
transformed probability space as shown in Eqn. 42; however, the
expression changes significantly, and in such a manner as to allow
for direct comparison to Eqn. 41.

$$A_k = \Big\{\bigcap_{i=1}^{d} \hat{y}^*_{k+i|k} : P\Big(\bigcap_{j=1}^{d} E^*_{k+j} \,\Big|\, y_0, \ldots, y_k\Big) \leq 1 - P_b^*\Big\}$$

$$= \Big\{\bigcap_{i=1}^{d} \hat{y}^*_{k+i|k} : \prod_{j=1}^{d} P(E^*_{k+j} \mid y_0, \ldots, y_k) \leq 1 - P_b^*\Big\} \tag{42}$$

Because containment in this probability space is invariant
under orthonormal rotations, it follows from Eqns. 41-42
that $\bigcup_{j=1}^{d} A_k^j \subseteq A_k$, so that the approximate alarm region
is a subset of the exact alarm region. Fig. 3 provides
illustrative evidence of this containment in the transformed
probability space when $d = 2$. Here, the union of the red and
blue colored sections represents $A_k$ (formula shown below)
and the blue colored section represents the approximation
$A_k^1 \cup A_k^2$.

$$A_k = \{(\hat{y}^*_{k+1|k}, \hat{y}^*_{k+2|k}) : P(E^*_{k+1} \mid y_0, \ldots, y_k) \cdot P(E^*_{k+2} \mid y_0, \ldots, y_k) \leq 1 - P_b^*\}$$

A successive approximation is required in order to obtain a closed-form representation and parametrization of the alarm region without having to resort to the root-finding required for solving $P(E'_{k+j} \mid y_0, \ldots, y_k) < 1 - P_b$, which is equivalent to $P(|y_{k+j}| > L \mid y_0, \ldots, y_k) > P_b$. This second approximation is given by Eqn. 43, which breaks this condition containing an absolute value into constitutive inequalities.

$$A_k^{i,j} = \left\{ \hat{y}_{k+j|k} : P(E^{i}_{k+j} \mid y_0, \ldots, y_k) > P_b \right\} \qquad (43)$$



IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-XX, NO. X, XXX 2010 


[Figure: Approximation in Transformed Probability Space, with the boundary $1 - P_b^* = 0.75$ marked on both axes]

Fig. 3. Containment of the approximation by exact alarm region

[Figure: Approximation in Probability Space]

Fig. 4. Closed-form approximation in probability space


where

$$i \in B = \{L, U\}$$

$$E^{L}_{k+j} = \{ y_{k+j} < -L \}$$

$$E^{U}_{k+j} = \{ y_{k+j} > L \}$$

Thus $P(E^{L}_{k+j} \mid y_0, \ldots, y_k) + P(E^{U}_{k+j} \mid y_0, \ldots, y_k) > P_b$ is approximated by two distinct inequalities given by the union of $P(E^{L}_{k+j} \mid y_0, \ldots, y_k) > P_b$ and $P(E^{U}_{k+j} \mid y_0, \ldots, y_k) > P_b$. This subsequent approximation can easily be visualized in Fig. 4. The union of the red and blue colored sections shown in Fig. 4 represents $A_k^j$. Thus the blue colored section alone from Fig. 4 is a subset of this area, such that $A_k^{L,j} \cup A_k^{U,j} \subset A_k^j$. If we replicate Fig. 4 for $j \in [1, \ldots, d]$, then it becomes clear that more generally Eqn. 44 holds, which summarizes all of the containment relationships for the approximations covered in this subsection.

$$\bigcup_{j=1}^{d} \bigcup_{i \in B} A_k^{i,j} \subseteq \bigcup_{j=1}^{d} A_k^j \subseteq A_k \qquad (44)$$

By using this successive approximation, we can now repre- 
sent the alarm region in “closed-form,” as shown in Eqn. 45 
below. 

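$$\bigcup_{j=1}^{d} \bigcup_{i \in B} A_k^{i,j} = \bigcup_{j=1}^{d} \left\{ |\hat{y}_{k+j|k}| > L + \sqrt{V_{k+j|k}}\, \Phi^{-1}(P_b) \triangleq L_{A_j} \right\} \qquad (45)$$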

$\Phi^{-1}(\cdot)$ represents the inverse cumulative standard normal distribution function, and $L_{A_j}, \forall j \in [1, \ldots, d]$ represent the limits of integration. The $L_{A_j}$ values can now be redefined to replace the integration limits used for the root-finding method in Eqns. 33-36. As such, these same equations are valid for computing $P_d$ and $P_{fa}$ in order to construct an ROC curve using this “closed-form” approximation as well. However, in place of $A_k$ when using these equations, the approximation $\bigcup_{j=1}^{d} \bigcup_{i \in B} A_k^{i,j}$ is used.
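
The limit check of Eqn. 45 is simple enough to sketch directly. The Python fragment below is a minimal illustration, assuming the prediction variances $V_{k+j|k}$ and the predicted values are already available from the Kalman filter as plain lists; the function names and the numbers used with them are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def closed_form_limits(L, pb, variances):
    """Return L_Aj = L + sqrt(V_j) * Phi^{-1}(Pb) for each prediction step j.

    `pb` must lie strictly between 0 and 1, otherwise inv_cdf raises an error.
    """
    z = NormalDist().inv_cdf(pb)  # inverse standard normal CDF
    return [L + sqrt(v) * z for v in variances]


def alarm(predictions, limits):
    """Raise an alarm if any |predicted value| exceeds its limit (union over j)."""
    return any(abs(y) > lim for y, lim in zip(predictions, limits))
```

For $P_b < 0.5$ the quantile is negative, so larger prediction variances pull the limits $L_{A_j}$ further inside the threshold $L$.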


The domain of feasibility for this approximation now changes, and $P_{b_{crit}}$ takes on a new value, which differs from the identical values of $P_{b_{crit}} = 1 - g(\mathbf{0}_d)$ and $P_{b_{crit}} = 1 - f(0)$ corresponding to the feasibility regions for the optimal alarm region and the root-finding approximation, respectively. A derivation for the new value of $P_{b_{crit}}$ is provided in Eqns. 46-50 below. The derivation is based upon the premise that $L_{A_j} > 0$, where the last step from Eqn. 49 to 50 uses Lemmas 2-5, which can be found in Appendix I, and the fact that $R > 0$.


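$$L_{A_j} > 0, \quad \forall j \in [1, \ldots, d] \qquad (46)$$

$$L + \sqrt{V_{k+j|k}}\, \Phi^{-1}(P_b) > 0, \quad \forall j \in [1, \ldots, d] \qquad (47)$$

$$\bigcap_{j=1}^{d} \left\{ P_b > \Phi\!\left( \frac{-L}{\sqrt{V_{k+j|k}}} \right) = P_{b_j} \right\} \qquad (48)$$

$$P_{b_{crit}} \triangleq \max_{j} P_{b_j} \qquad (49)$$

$$= \Phi\!\left( \frac{-L}{\sqrt{V_{k+d|k}}} \right) = P_{b_d} \qquad (50)$$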
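
As a small worked illustration of Eqns. 46-50, the Python sketch below evaluates $P_{b_j} = \Phi(-L / \sqrt{V_{k+j|k}})$ for each prediction step and takes the maximum. The variances passed in are assumed inputs (hypothetical numbers in the usage note), and the function name is not from the article.

```python
from math import sqrt
from statistics import NormalDist


def pb_crit(L, variances):
    """Largest P_bj = Phi(-L / sqrt(V_j)) over the prediction steps j.

    Any design value Pb above this bound keeps every limit
    L_Aj = L + sqrt(V_j) * Phi^{-1}(Pb) strictly positive.
    """
    phi = NormalDist().cdf  # standard normal CDF
    return max(phi(-L / sqrt(v)) for v in variances)
```

For example, with $L = 2$ and variances 1 and 4, the bound is attained at the larger variance, which agrees with the claim that the maximum occurs at the longest prediction step when the variances grow with $j$.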


Again, by using asymptotes we implicitly make a geometrical approximation by forming a hyperbox around the alarm region. As before, simple 2-dimensional examples of such hyperboxes for various values of $L$ and $P_b$ are shown in Fig. 5. Furthermore, just as for the root-finding approximation, there is visual evidence that limiting effects for this approximation also exist as both $L$ and $P_b$ approach the extremities of their feasible ranges. Note that both the approximation represented by Fig. 3 and the successive approximation represented by Fig. 4 have been applied to yield the vector space result shown in Fig. 5. Both Figs. 3 and 5 have been illustrated for the case when $d = 2$.

Due to the containment relationship labeled Eqn. 44, qualitative arguments for the under-reporting of $P_d$ and $P_{fa}$ can be made for this approximation. A less aggressive, more optimistic strategy will result in comparison to the exact optimal method. It is unclear if this approximation will be more or less accurate than the previous root-finding approximation.
[Figure: Closed form approximations (at least one exceedance in 2 steps); four panels showing alarm regions for L = 2, L = 3, L = 4, and L = 5]

Fig. 5. Closed-form approximations for optimal alarm region


However, we do know that the off-diagonal elements of the covariance matrix $\Sigma_{y_d}$ are not used for computing the asymptotes of this “closed-form” approximation. Recall that the root-finding method incorporates all elements of the covariance matrix when computing the asymptotes. Yet both methods use asymptotic approximations which are parameterized only by the corresponding dimension of the conditional mean, $\hat{y}_{k+j|k}$.

As is apparent intuitively from Figs. 2 and 5, $A_k^j \subset \Omega_{A_j}$, thus $\bigcup_{j=1}^{d} A_k^j \subset \bigcup_{j=1}^{d} \Omega_{A_j}$. It is clear from visual comparison of these figures that this containment relationship exists between the root-finding and “closed-form” approximations. For a mathematical proof of this containment, recall Eqns. 28-29 for $\Omega_{A_j}$, shown again below, and compare them to Eqn. 37 for $A_k^j$, also shown again below.


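$$\Omega_{A_j} = \left\{ \hat{y}_{k+j|k} : f(\hat{y}_{k+j|k}) < 1 - P_b \right\}$$

$$= \left\{ |\hat{y}_{k+j|k}| > L_{A_j} \right\}$$

$$A_k^j = \left\{ \hat{y}_{k+j|k} : P(E'_{k+j} \mid y_0, \ldots, y_k) < 1 - P_b \right\}$$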

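If we look closely at the regions of integration for $f(\hat{y}_{k+j|k})$ and $P(E'_{k+j} \mid y_0, \ldots, y_k)$, as shown in Eqns. 51-55 below, we will notice that a clear containment relationship exists.

$$f(\hat{y}_{k+j|k}) = \lim_{\hat{\mathbf{y}}_d \setminus \hat{y}_{k+j|k} \to 0} P\Big(\bigcap_{j=1}^{d} E'_{k+j} \,\Big|\, y_0, \ldots, y_k\Big) \qquad (51)$$

$$= \int_{\mathcal{D}_{\Omega}} \mathcal{N}(\mathbf{y}_d; \hat{\mathbf{y}}_d, \Sigma_{y_d})\, d\mathbf{y}_d \qquad (52)$$

$$= \int_{-L}^{L} \cdots \int_{-L - \hat{y}_{k+j|k}}^{L - \hat{y}_{k+j|k}} \cdots \int_{-L}^{L} \mathcal{N}(\mathbf{y}_d; \hat{\mathbf{y}}_d, \Sigma_{y_d})\, d\mathbf{y}_d \qquad (53)$$

$$P(E'_{k+j} \mid y_0, \ldots, y_k) = \int_{\mathcal{D}_{a}} \mathcal{N}(\mathbf{y}_d; \hat{\mathbf{y}}_d, \Sigma_{y_d})\, d\mathbf{y}_d \qquad (54)$$

$$= \int_{-\infty}^{\infty} \cdots \int_{-L - \hat{y}_{k+j|k}}^{L - \hat{y}_{k+j|k}} \cdots \int_{-\infty}^{\infty} \mathcal{N}(\mathbf{y}_d; \hat{\mathbf{y}}_d, \Sigma_{y_d})\, d\mathbf{y}_d \qquad (55)$$

where

$$\mathbb{X} = \{[-L, L]\} \subset \mathbb{R}$$

$$\mathcal{D}_{\Omega} \triangleq \left\{ \mathbb{X}^{d-1} \times [-L - \hat{y}_{k+j|k},\; L - \hat{y}_{k+j|k}] \right\}$$

$$\mathcal{D}_{a} \triangleq \left\{ \mathbb{R}^{d-1} \times [-L - \hat{y}_{k+j|k},\; L - \hat{y}_{k+j|k}] \right\}$$

It is clear that $\mathcal{D}_{\Omega} \subset \mathcal{D}_{a}$ due to the fact that $\mathbb{X}^{d-1} \subset \mathbb{R}^{d-1}$. As such, $f(\hat{y}_{k+j|k}) < P(E'_{k+j} \mid y_0, \ldots, y_k)$ easily follows due to the fact that both expressions share a common integrand. It is therefore evident that our original claim $A_k^j \subset \Omega_{A_j}$, and thus $\bigcup_{j=1}^{d} A_k^j \subset \bigcup_{j=1}^{d} \Omega_{A_j}$, is mathematically sound.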

According to this newly derived containment relationship, 
and by again using qualitative arguments, it is clear that the 
root-finding approximation will be more aggressive, and less 
optimistic than the closed form approximation. However, there 
is no containment relationship that can be established between 
the root-finding method and the exact optimal alarm region as 
could be performed for the closed form approximation. As 
such, even though the root-finding method incorporates all 
elements of the covariance matrix when computing its asymp- 
totes, this approximation strategy may be overly aggressive 
and overshoot the performance of the exact optimal method 
under certain circumstances. This mathematical intuition will 
be supported by demonstrating this effect with examples later 
in the results section. 


C. Redline and Predictive Alarm Systems 

The two baseline alarm systems mentioned previously (redline and predictive) will be compared to the optimal alarm system and its approximations. All methods will attempt to predict the level-crossing event defined by Eqn. 4. The redline alarm system attempts to define an envelope, $[-L_A, L_A]$, outside of which an alarm will be triggered to forewarn of the impending level-crossing event. The probabilities necessary to compute $P_d$ and $P_{fa}$ based upon Eqns. 33-34 for this alarm system are provided in Eqns. 56-59, where we re-define $A_k = \{|y_k| > L_A\}$, such that the alarm is based only on the current process value.


$$P(A_k) = P(|y_k| > L_A) \qquad (56)$$

$$= 2\Phi\!\left( \frac{-L_A}{\sqrt{C P^{L}_{ss} C^T + R}} \right) \qquad (57)$$

$$P(C_k, A_k) = P(C_k) - P(A'_k) + P(C'_k, A'_k) \qquad (58)$$

$$P(C'_k, A'_k) = \int_{-L_A}^{L_A} \int_{-L}^{L} \cdots \int_{-L}^{L} \mathcal{N}(\mathbf{z}; \mu_z, \Sigma_z)\, d\mathbf{z} \qquad (59)$$

where

$$\mathbf{z} \triangleq \begin{bmatrix} y_k \\ \mathbf{y}_d \end{bmatrix}, \qquad \mu_z \triangleq \begin{bmatrix} \mu_{y_k} \\ \mu_{y_d} \end{bmatrix} = \mathbf{0}_{d+1}$$

$$\Sigma_z(i, j) = \begin{cases} C P^{L}_{ss} C^T + R & \forall i = j \in [0, \ldots, d] \\ C A^{j-i} P^{L}_{ss} C^T & \forall j > i \in [0, \ldots, d] \end{cases}$$

The “redline” alarm system is termed as such in order to indicate that a simple alarm level crossing is used to predict a second more critical level-crossing. In this case two levels are used, $L$ as the failure threshold, and $L_A$ as the design threshold. For reasons stated earlier, this alarm system would be superior to a redline system that uses only a single level $L$, even though predicted future process values are not used.

The “predictive” alarm system does incorporate the use of predicted future process values, and defines the same envelope, $[-L_A, L_A]$, outside of which an alarm will be triggered to forewarn of the impending level-crossing event. However, the alarm definition differs from the redline method, such that $A_k = \{|\hat{y}_{k+d|k}| > L_A\}$. The predicted future process value $\hat{y}_{k+d|k}$ is found from standard Kalman filter Eqn. 14. The probabilities necessary to compute $P_d$ and $P_{fa}$ based upon Eqns. 33-34 for this alarm system are provided in Eqns. 60-63.

$$P(A_k) = P(|\hat{y}_{k+d|k}| > L_A) \qquad (60)$$

$$= 2\Phi\!\left( \frac{-L_A}{\sqrt{\Sigma_{\hat{a}}}} \right) \qquad (61)$$

$$P(C_k, A_k) = P(C_k) - P(A'_k) + P(C'_k, A'_k) \qquad (62)$$

$$P(C'_k, A'_k) = \int_{-L}^{L} \cdots \int_{-L}^{L} \int_{-L_A}^{L_A} \mathcal{N}(\mathbf{z}; \mu_z, \Sigma_z)\, d\mathbf{z} \qquad (63)$$

where

$$\mathbf{z} \triangleq \begin{bmatrix} \mathbf{y}_d \\ \hat{y}_{k+d|k} \end{bmatrix}, \qquad \mu_z \triangleq \begin{bmatrix} \mu_{y_d} \\ \mu_{\hat{y}_{k+d|k}} \end{bmatrix} = \mathbf{0}_{d+1}, \qquad \Sigma_z \triangleq \begin{bmatrix} \Sigma_{y_d} & \Sigma_{a}^T \\ \Sigma_{a} & \Sigma_{\hat{a}} \end{bmatrix}$$

$$\Sigma_{\hat{a}} \triangleq C A^{d} (P^{L}_{ss} - \hat{P}^{R}_{ss}) (A^T)^{d} C^T$$

$$\Sigma_{a} \triangleq C A^{d} (P^{L}_{ss} - \hat{P}^{R}_{ss}) \mathcal{O}^T$$

Note that $\Sigma_{a}$ and $\Sigma_{\hat{a}}$ have been derived with the aid of the projection theorem. All of the alarm systems described thus far will be compared using the area under the ROC curve (AUC). This provides a performance metric that characterizes the ability of each alarm system to accurately predict the level-crossing event. The AUC has been deemed a theoretically valid metric for model selection and algorithmic comparison [19]. The parameters of interest are $L_A$ for the redline and predictive methods, and $P_b$ for the optimal alarm system and its approximations. Results will follow in the subsequent section.

IV. Example

The example to be used for the presentation of our results has no specific application, but is generic and based upon the same example used by Svensson [2]. The model parameters are provided in Eqns. 64-67.

$$A = \begin{bmatrix} 0 & 1 \\ -0.9 & 1.8 \end{bmatrix} \qquad (64)$$

$$C = \begin{bmatrix} 0.5 & 1 \end{bmatrix} \qquad (65)$$

$$Q = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \qquad (66)$$

$$R = 0.08 \qquad (67)$$

Unless otherwise stated, all cases to be compared will use a threshold of $L = 16$ while varying $d$, or a prediction window of $d = 5$ while varying $L$.

V. Results & Discussion

A comparison of the AUC for all alarm systems for a prediction window of $d = 5$ while varying $L \in [2.89, 17.83]$ is shown in Fig. 6.

[Figure: AUC vs. critical threshold. Legend: Optimal; Closed-form approximation; Root-finding approximation; Redline; Predictive; Redline Simulated; Predictive Simulated]

Fig. 6. AUC for all alarm systems as a function of critical threshold, L

It is very clear that the optimal alarm system and its approximations outperform the redline and predictive methods over the entire range of values shown for $L$, as expected. Another important point to note is that the approximations, shown as dashed and dash-dotted blue lines, approximate the exact optimal performance (in solid blue) quite well over most of the range of values shown for $L$. However, as $L \to 0$, the approximation breaks down, as evidenced by the notable divergence of AUC values. More careful analysis of the reasons for this divergence, including its relation to the design parameter $P_b$, will be presented subsequently.

The ROC curves for the exact optimal alarm system were 
computed using the Monte-Carlo style simulation described 
earlier, and based upon a computationally efficient method 
documented by Fawcett [20]. The corresponding AUC values
were computed using a trapezoidal integration method pre- 
sented by Bradley [19]. The same simulation-based method 
for generating ROC curve statistics and subsequently the AUC 
value can be used for both the redline and predictive methods, 
as shown in Fig. 6. 
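
For readers who wish to reproduce the AUC step, a minimal Python sketch of trapezoidal integration over ROC points follows. It assumes the ROC curve is already available as two equal-length lists sorted by increasing false positive rate; the function name is illustrative and the sketch is not the authors' implementation.

```python
def auc_trapezoid(fpr, tpr):
    """Area under an ROC curve by the trapezoidal rule.

    `fpr` and `tpr` are equal-length sequences of false and true positive
    rates, ordered by increasing false positive rate.
    """
    area = 0.0
    for i in range(1, len(fpr)):
        width = fpr[i] - fpr[i - 1]
        mean_height = (tpr[i] + tpr[i - 1]) / 2.0
        area += width * mean_height
    return area
```

A chance-level curve along the diagonal integrates to 0.5, and a curve that rises immediately to a true positive rate of 1 integrates to 1.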

The ROC curve statistics for both approximations to the 
optimal alarm system were generated by the use of Eqns. 
33-34, as were the redline and predictive alarm systems. 
The latter were verified by comparison to the corresponding 
simulation-based AUC values, which matched quite well. Genz’s [17]
numerical integration technique, used to compute the
probabilities given in Eqns. 33-34, is inherently based
upon Monte Carlo sampling. As such, a fixed number of
random samples must be chosen to guide the resolution for 
all integrations. All of the results presented in this article use 
3600 as the number of random samples for each integration 
performed. 

The ROC curve statistics in this case were computed in 
a different manner than their simulation-based counterparts. 
The design parameters of interest, $L_A$ or $P_b$, were varied
over their feasible ranges in an adaptive pointwise manner in 
order to construct ROC curves that targeted a fixed resolution. 
However, the AUC values were still computed based upon 
the same trapezoidal integration methods presented by Bradley 
[19] as before. 

It is necessary to ensure that the smoothness of the ROC 
curves and AUC curve as a function of L constructed by using 
these different methods are comparable. We appeal to use of 
the standard error to reconcile the contrasting magnitudes of 
error introduced. It is well known that the Hanley-McNeil 
method [19] for estimating the standard error of the AUC 
yields confidence bounds that are often too conservative and 
excessively wide. As such, a bootstrap resampling method was 
used to form confidence bounds for the AUC values resulting 
from the application of Eqns. 33-34 to construct corresponding 
ROC curves. 
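
The article does not spell out exactly what was resampled, so the following Python fragment is only a generic sketch of a bootstrap standard error. It assumes, purely for illustration, that a list of replicate AUC estimates is available (for example from repeated Monte Carlo integrations) and estimates the standard error of their mean; the function name, the replicate values in the usage note, and the number of resamples are all hypothetical.

```python
import random
from statistics import mean, stdev


def bootstrap_se(values, n_boot=2000, seed=0):
    """Bootstrap standard error of the mean of `values`.

    Resamples `values` with replacement `n_boot` times and returns the
    standard deviation of the resampled means. A fixed seed keeps the
    result reproducible.
    """
    rng = random.Random(seed)
    n = len(values)
    boot_means = [mean(rng.choices(values, k=n)) for _ in range(n_boot)]
    return stdev(boot_means)
```

Calling `bootstrap_se([0.90, 0.91, 0.92, 0.89, 0.93])` gives a value close to the familiar estimate of the sample standard deviation divided by the square root of the sample size, which can then be turned into confidence bounds.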

The resulting SE(AUC) values were then subsequently 
used to guide establishing a partial termination criterion for 
the ROC curves constructed via simulation. This provides 
some assurance that the smoothness of the ROC and AUC 
curves using the two different methods are comparable. An 
additional termination criterion to complement the SE(AUC) 
value criterion is to use the fixed resolution targeted for 
construction of the ROC curve as before. Since we now 
have some assurance of comparability of smoothness of the 
AUC curve and resolution of the ROC curve, the issue of 
computational complexity can be addressed. Table II provides 
a summary of the empirically generated timing tests which 
illustrate both off-line design-time and on-line run-time com- 
putational complexity. 


TABLE II

Empirical Analysis of Computational Complexity

Alarm system     Mean Design-Time     Mean Run-Time
Optimal          81 min               9.5 msec
Closed-form      48.5 sec             0.15 msec
Root-finding     57.3 sec             0.12 msec


The second column of Table II includes the mean design time of both the redline and predictive alarm systems as well as the optimal system or its approximations across all values of $L$. Clearly, the simulation-based method of designing alarm systems carries a computational burden roughly two orders of magnitude greater (81 min versus under 1 min). Also, as expected, the mean design-time for the root-finding approximation exceeds that of the closed-form approximation. As is clear from Fig. 6, there is no great loss in accuracy by using these approximations except for small values of $L$, where there is a perceptible, but perhaps still negligible, loss.

The third column of Table II provides the mean run-time 
across all values of L, where it is evident again that the 
computational requirements of the optimal alarm condition 
exceed those of its approximations. In this case, the approxi- 
mations involve only the time for limit checking of the type 
governed by Eqn. 45. Thus the actual time for root-finding is
not included in the reported time for that approximation as 
shown in Table II, which might account for the fact that it is 
on par with the time for the closed form approximation. The 
mean run-time for checking the exact optimal alarm condition 
is based upon computing Eqns. 22-23, which naturally requires 
more time than a simple limit check. 

Note that we have summarized these empirical complexity 
results in a tabular rather than a graphical fashion, aggregating the
results by taking the mean over all values of L. The main 
reason for doing so is that there is no perceptible trend across 
L for any of the cases, with the possible exception of the 
design time for both the closed form and root finding approx- 
imations. For these exceptions, there is a general upward trend 
of the design time (which again includes design times for both
the redline and predictive alarm systems) as L increases. This 
effect is intuitive because it becomes more difficult to construct 
an ROC curve for low probability events (higher L) that have 
the same target resolution as higher probability events (lower 
L ), when employing numerical integration. 

It is also of interest to investigate the case when using a fixed threshold of $L = 16$ while varying $d \in [2, \ldots, 24]$. A comparison of the AUC for all alarm systems for this case is shown in Fig. 7. As is clear from Fig. 7 and corroborated by Fig. 6, the optimal alarm system and its approximations outperform the redline and predictive methods as before, this time over the entire range of values shown for $d$. Furthermore, as the prediction window increases, the predictive performance as characterized by the AUC decreases for all alarm systems, as is to be expected. A more detailed study on the limiting effects of AUC as $d \to \infty$ will be conducted in a sequel article. Due to the use of a modestly large fixed threshold of $L = 16$, however, there are no deleterious effects as a result of using approximations to the optimal alarm system, as were found when investigating the case when varying $L$ to small values.

[Figure: AUC vs. prediction horizon. Legend: Optimal; Closed-form approximation; Root-finding approximation; Redline; Predictive; Redline Simulated; Predictive Simulated]

Fig. 7. AUC for all alarm systems as a function of prediction window, d

Characterization of complexity as d increases is also of 
interest. For the most part, the results are very similar to what 
was presented in Table II for the case in which a prediction 
window of d = 5 was used while varying L. Specifically, the 
mean design time for the exact optimal alarm system (along 
with redline and predictive alarm systems) was on par with 
what was shown in Table II (74 min in lieu of 81 min). 
However, the run-time in this case increases linearly as shown 
in Fig. 8. 


[Figure: Real-Time Implementation Burden for Exact Optimal Alarm System]

Fig. 8. Empirical run-time complexity as a function of prediction window

As the prediction window increases, checking the exact optimal alarm condition based upon computing Eqns. 22-23 naturally requires more time. A key advantage in using approximations can therefore be realized. For both the closed form and root finding approximations, the mean runtime is on par with what was presented in Table II for the case in which a prediction window of $d = 5$ was used while varying $L$ (averaging 0.11 msec). This is primarily due to the fact that, again, runtime for the approximations involves only limit checking of the type governed by Eqn. 45.

As for the design time of the approximations, they too 
exhibit similar characteristics to what was presented and 
discussed in conjunction with Table II. Specifically, there is a 
general upward trend of the design time (which again includes
design times for both the redline and predictive alarm systems) 
as d increases. The mean design times are moderately higher 
than what was presented in Table II (111 sec in lieu of 44.2 
sec for the closed-form approximation and 129 sec in lieu of 
55.2 sec for the root-finding approximation). 

Now we return to addressing the issue of the limitations of using the optimal alarm approximations, which break down as $L \to 0$. A notable divergence of AUC values was evident in Fig. 6 under this condition. We may gain insight into the reasons for this divergence by examining a candidate ROC curve corresponding to a small value of $L$. In Fig. 9, we can visually discern how both approximations break down as related to the design parameter for a small value of $L \approx 4$ compared to a larger value of $L \approx 10$.

There are many observations which can be made about Fig. 9. The topmost panels of the figure illustrate ROC curves corresponding to the different values of $L$. It is clear that the different methods of constructing the ROC curves for the predictive and redline alarm systems yield almost identical results. This also serves to verify the correctness and equivalence of using either method of ROC curve construction for these alarm systems. They manifest a reasonably similar level of resolution and smoothness due to proper choice of termination criteria.

However, for the optimal alarm system in solid blue, the two approximations shown as dashed and dotted blue lines yield ROC curves that are close but not identical to the exact optimal result when $L \approx 4$. On the top right panel when $L \approx 10$, the ROC curve approximations appear to be much closer than on the top left panel where $L \approx 4$. This substantiates a previous observation made from Fig. 6, that as $L$ decreases, the approximation loses its accuracy. Furthermore, in the previous section, Figs. 2 and 5 showed the optimal alarm regions and their approximations to provide further evidence of this loss of accuracy as $L$ decreases. Those figures were based upon the same example used to generate the results presented in this section.

Further insight can be gained by inspecting the bottom two panels of Fig. 9 as well. Note that the bottom panels show the missed detection and false positive rates as a function of $P_b$. The complement of the former is the true positive rate, which along with the false positive rate, is used to construct the ROC curves shown on the top panels. There are a few important observations to be made in regards to these bottom panels. First, the closed form approximations to the optimal alarm system yield true and false positive rates that are systematically underreported for both values of $L$ shown. This corroborates the mathematical observation made from the previous section based upon the containment relationship of the closed form approximation to the exact optimal alarm region, $\bigcup_{j=1}^{d} \bigcup_{i \in B} A_k^{i,j} \subseteq \bigcup_{j=1}^{d} A_k^j \subseteq A_k$. For the smaller value of $L \approx 4$, this underreporting of the true and false positive rates is even more striking than for the larger value of $L \approx 10$.

[Figure: Top panels: ROC Curve Approximation for L = 3.9192 and for L = 9.9293 (legend: Optimal; Closed form approximation; Root finding approximation; Redline; Predictive; Redline Simulation; Predictive Simulation). Bottom panels, for the same two values of L: missed detection rate and false positive rate for the optimal alarm system, the closed form approximation, and the root finding approximation]

Fig. 9. ROC Curves and supporting statistics for all alarm systems, demonstrating negligible loss in accuracy for both approximations, and superiority of root-finding approximation over closed-form approximation

Furthermore, the root finding approximations to the optimal alarm system yield true and false positive rates that are overreported for both values of $L$ shown. This is much clearer for the smaller value of $L \approx 4$ than for the larger value of $L \approx 10$. Hence, again this corroborates an inference made from mathematical observations made in the previous section. Recall the containment relationship between the root finding and closed form approximation to the exact optimal alarm region, $\bigcup_{j=1}^{d} A_k^j \subseteq \bigcup_{j=1}^{d} \Omega_{A_j}$. It was suggested that the root finding approximation strategy may be overly aggressive and overshoot the performance of the exact optimal method under certain circumstances. This is clear for the smaller value of $L \approx 4$.

There is one last important note about the root finding approximations that is evident in the bottom two panels of Fig. 9. The feasible range of values for $P_b$ is identical to the exact optimal alarm region of feasibility, which was also proven mathematically in the previous section. The same is not true for the closed form approximation, where the region of feasibility is clearly different, and drastically so for the smaller value of $L \approx 4$.

Finally, it is evident that the underreporting of true and false 
positive rates as demonstrated in the bottom two panels of Fig. 
9 does not translate to the same visually striking disparities 
for the ROC curves on the top two panels. These striking 
disparities are obfuscated by the fact that the ROC curve is 
a parametric function of the design parameter. As such, great 
caution should be taken when using the ROC curve as the 
sole basis for the design of alarm systems based upon the given
approximations. Specific criteria for the design of an alarm 
system based upon provisions for maximum allowable false 
positive or missed detection rates may be given. With these 
constraints, the supporting statistics as shown on the bottom 
panels of Fig. 9 should be used to complement design based 
upon the ROC curve. 
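
One way to act on this recommendation is to tabulate the supporting statistics against the design parameter and filter them by the allowable rates. The Python sketch below assumes such a table is available as rows of $(P_b, \text{missed detection rate}, \text{false positive rate})$; the function name and the numbers in the usage note are hypothetical and are not results from the article.

```python
def feasible_design_points(table, max_fpr=None, max_missed=None):
    """Keep the rows (pb, missed_rate, fpr) that satisfy the given caps.

    A cap left as None is not enforced.
    """
    keep = []
    for pb, missed, fpr in table:
        if max_fpr is not None and fpr > max_fpr:
            continue
        if max_missed is not None and missed > max_missed:
            continue
        keep.append((pb, missed, fpr))
    return keep
```

For instance, with rows `(0.1, 0.05, 0.40)`, `(0.3, 0.15, 0.20)` and `(0.5, 0.35, 0.05)`, capping the false positive rate at 0.25 and the missed detection rate at 0.2 leaves only the design point with $P_b = 0.3$.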

VI. Conclusions & Future Work 

In this article we have introduced a novel state-space 
approach to the optimal alarm systems literature, and hope to 
have also participated in the Kalman filter-based fault detection 
literature discussion from a different theoretical angle as well. 




In doing so, we have demonstrated that there is a negligible 
loss in overall accuracy when using approximations to the 
theoretically optimal predictor for a stationary linear Gaussian 
process, at the advantage of greatly reduced computational 
complexity. The negligibility of the loss in accuracy was 
demonstrated by comparing approximations to the optimal 
level-crossing predictor to two competing methods which were 
clearly outperformed over various ranges for both L and d. 
However, care should be taken when designing alarm systems 
for which level-crossing events are defined with small values 
of L. Specifically, when using approximations, alarm system 
design should be governed both by ROC curve analysis as well 
as supporting false positive or missed detection rate statistics 
parameterized by the design parameter $P_b$.

In future work, we will investigate limiting effects of 
AUC for the closed-form approximation introduced in this 
article. Specifically, limiting values for relevant statistics as 
Pb, L, R, and d approach the extremities of their feasible 
ranges will be examined. In doing so we hope to facilitate 
a new and broader context for the design of an optimal 
alarm system as related more to important engineering design 
parameters. Furthermore, we aim to investigate the control
theoretic implications and ramifications of using the Kalman
filter in tandem with optimal alarm theory that naturally
follow. Here it will also be possible to gain further insight
into important engineering design considerations for both the
analysis and synthesis of algorithms used for mitigation of
potential failures from a practical standpoint. Relaxing some of
the inherent assumptions made in this article, to the point where
non-parametric methods such as particle filtering become applicable,
may also provide a natural vehicle for the extension of optimal alarm
theory to other practical research domains. Finally, extension
of this work to systems containing both multivariate inputs and 
outputs is important, and has practical appeal to the field of 
data mining. As such, scalability and complexity will remain 
important considerations. 


Appendix I 

Theorems and Lemmas 

Theorem 1: From Eqns. 1-3 it is clear that successive output values of the stationary stochastic process, $y_k$, admit a well-defined jointly Gaussian probability density function. Also, the level-crossing event, $C_k$, defined through Eqn. 4, represents at least one exceedance outside of the threshold envelope specified by $[-L, L]$ of the process $y_k$. Then the optimal level-crossing predictor can be written as $P(C_k \mid y_0, \ldots, y_k) > P_b$, where the condition for optimality is as specified and defined by the use of the likelihood ratio criterion, shown in Eqn. 68 as a result of the Neyman-Pearson Lemma, shown by DeMare [5], and more explicitly by Lindgren [6], [21].

$$\frac{p(y_0, \ldots, y_k \mid C'_k)}{p(y_0, \ldots, y_k \mid C_k)} < \lambda \qquad (68)$$

Proof: Using Lemma 1 (which curiously appears very much like Bayes’ rule, but can be distinguished from it due to the use of both probabilities and density functions), we can rewrite Eqn. 68 as follows:

$$\frac{p(y_0, \ldots, y_k \mid C'_k)}{p(y_0, \ldots, y_k \mid C_k)} < \lambda$$

$$\frac{P(C'_k \mid y_0, \ldots, y_k)\, \dfrac{p(y_0, \ldots, y_k)}{P(C'_k)}}{P(C_k \mid y_0, \ldots, y_k)\, \dfrac{p(y_0, \ldots, y_k)}{P(C_k)}} < \lambda$$

$$\frac{P(C'_k \mid y_0, \ldots, y_k)\, P(C_k)}{P(C_k \mid y_0, \ldots, y_k)\, P(C'_k)} < \lambda$$

However, due to the assumption of stationarity of the process, the size of the alarm region, $P(C_k)$, associated with the uniformly most powerful test of the hypothesis $H_0$ is by definition a constant value. The hypothesis being tested in this case is of the level-crossing event, $C_k$. Due to the size of the alarm region being fixed, we can define new constants as shown below.

$$\frac{1 - P(C_k \mid y_0, \ldots, y_k)}{P(C_k \mid y_0, \ldots, y_k)} < \lambda \frac{P(C'_k)}{P(C_k)} \triangleq \gamma$$

$$P(C_k \mid y_0, \ldots, y_k) > \frac{1}{1 + \gamma} \triangleq P_b$$

$$\iff P(C_k \mid y_0, \ldots, y_k) > P_b$$

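Lemma 1:

$$p(y_0, \ldots, y_k \mid C_k) = \frac{P(C_k \mid y_0, \ldots, y_k)\, p(y_0, \ldots, y_k)}{P(C_k)} \qquad (69)$$

Proof:

$$p(y_0, \ldots, y_k \mid C_k) \triangleq \frac{\int \cdots \int_{\Omega_c} p(y_0, \ldots, y_{k+d})\, d\mathbf{y}_d}{P(C_k)}$$

$$= \frac{\int \cdots \int_{\Omega_c} p(y_0, \ldots, y_{k+d})\, d\mathbf{y}_d}{p(y_0, \ldots, y_k)} \cdot \frac{p(y_0, \ldots, y_k)}{P(C_k)}$$

$$= \frac{P(C_k \mid y_0, \ldots, y_k)\, p(y_0, \ldots, y_k)}{P(C_k)}$$

where by definition

$$P(C_k \mid y_0, \ldots, y_k) \triangleq \int \cdots \int_{\Omega_c} p(y_{k+1}, \ldots, y_{k+d} \mid y_0, \ldots, y_k)\, d\mathbf{y}_d = \frac{\int \cdots \int_{\Omega_c} p(y_0, \ldots, y_{k+d})\, d\mathbf{y}_d}{p(y_0, \ldots, y_k)}$$

and

$$\Omega_c \triangleq \left\{ \mathbf{y}_d \in \mathbb{R}^d : C_k = \bigcup_{j=1}^{d} E_{k+j} \right\}$$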

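Lemma 2:

$$P_{b_d} = \max_{j} P_{b_j}$$

Proof: The posited claim is true iff

$$P_{b_1} < \ldots < P_{b_j} < P_{b_{j+1}} < \ldots < P_{b_d}$$

More compactly,

$$P_{b_j} < P_{b_{j+1}}, \quad \forall j \in [1, \ldots, d]$$

The following chain of inequalities is true $\forall j \in [1, \ldots, d]$.

$$P_{b_j} < P_{b_{j+1}}$$

$$\Phi^{-1}(P_{b_j}) < \Phi^{-1}(P_{b_{j+1}})$$

$$\frac{-L}{\sqrt{V_{k+j+1|k}}} > \frac{-L}{\sqrt{V_{k+j|k}}}$$

$$V_{k+j+1|k} \geq V_{k+j|k}, \quad \forall j \in [1, \ldots, d]$$

Lemma 3:

$$P^{R}_{ss} \succeq \hat{P}^{R}_{ss} \implies V_{k+j+1|k} \geq V_{k+j|k}, \quad \forall j \in [1, \ldots, d]$$

Proof:

$$P^{R}_{ss} \succeq \hat{P}^{R}_{ss}$$

$$x^T P^{R}_{ss}\, x \geq x^T \hat{P}^{R}_{ss}\, x$$

$$x^T (\hat{P}^{R}_{ss} + Q - P^{R}_{ss})\, x \leq x^T Q x, \quad \forall x \in \mathbb{R}^n$$

By using the steady-state version of Eqn. 12 and the discrete algebraic Lyapunov equation we now have the following, $\forall x \in \mathbb{R}^n$.

$$x^T (P^{R}_{ss} - A P^{R}_{ss} A^T)\, x \leq x^T (P^{L}_{ss} - A P^{L}_{ss} A^T)\, x$$

$$x^T (P^{R}_{ss} - P^{L}_{ss})\, x \leq x^T A (P^{R}_{ss} - P^{L}_{ss}) A^T x$$

Let $x^T = C A^{j}, \forall j \in [1, \ldots, d]$, and add $C P^{L}_{ss} C^T + R$ to both sides of the inequality above. It then follows that the following relations hold true, $\forall j \in [1, \ldots, d]$.

$$C P_{k+j|k} C^T + R \leq C P_{k+j+1|k} C^T + R$$

$$V_{k+j+1|k} \geq V_{k+j|k}$$

Lemma 4: $R > 0 \implies P^{R}_{ss} \succeq \hat{P}^{R}_{ss}$

Proof: It is true that

$$R > 0 \iff R^{-1} > 0$$

Under the condition that $C \in \mathbb{R}^{1 \times n}$, where $n > 1$, with no rank condition on $C$, Lemma 5 can be used to support the following implication:

$$R^{-1} > 0 \implies C^T R^{-1} C \succeq 0$$

Also, given the matrix inversion lemma applied to Eqn. 19 shown below, the subsequent series of equations proves that $P^{R}_{ss} \succeq \hat{P}^{R}_{ss}$.

$$\hat{P}^{R}_{ss} = P^{R}_{ss} - P^{R}_{ss} C^T (C P^{R}_{ss} C^T + R)^{-1} C P^{R}_{ss}$$

$$= \left[ (P^{R}_{ss})^{-1} + C^T R^{-1} C \right]^{-1}$$

$$(\hat{P}^{R}_{ss})^{-1} = (P^{R}_{ss})^{-1} + C^T R^{-1} C$$

$$C^T R^{-1} C \succeq 0$$

$$(\hat{P}^{R}_{ss})^{-1} - (P^{R}_{ss})^{-1} \succeq 0$$

$$(\hat{P}^{R}_{ss})^{-1} \succeq (P^{R}_{ss})^{-1}$$

$$P^{R}_{ss} \succeq \hat{P}^{R}_{ss}$$

Lemma 5: Given $L \in \mathbb{R}^{n \times d}$, for which $d > n$ and there exists no rank condition on $L$: $M \succ 0 \implies L^T M L \succeq 0$

Proof:

$$M \succ 0$$

$$x^T M x > 0, \quad \forall x \in \mathbb{R}^n$$

$$x \triangleq L y$$

$$\mathrm{Null}(L) \triangleq \{ y : L y = 0 \}$$

$$\dim \mathrm{Null}(L) \geq d - n > 0$$

$$\exists\, y \neq 0 : L y = 0$$

$$y^T L^T M L y \geq 0, \quad \forall x \in \mathbb{R}^n$$

$$L^T M L \succeq 0$$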
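
The monotonicity claimed in Lemma 3 can be checked numerically on the example model. The Python sketch below is an illustration only: it takes the model values as read from Eqns. 64-67 and assumes the standard Kalman relations $V_{k+j|k} = C P_{k+j|k} C^T + R$ and $P_{k+j+1|k} = A P_{k+j|k} A^T + Q$, starting from the steady-state a priori covariance obtained by iterating the Riccati recursion. These relations stand in for the article's Eqns. 12-19, which are not reproduced in this appendix, and all function names are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


# Model values as read from Eqns. 64-67.
A = [[0.0, 1.0], [-0.9, 1.8]]
C = [[0.5, 1.0]]
Q = [[0.0, 0.0], [0.0, 1.0]]
R = 0.08


def time_update(P):
    """One prediction step: A P A^T + Q."""
    return add(matmul(matmul(A, P), transpose(A)), Q)


def output_variance(P):
    """Scalar variance C P C^T + R."""
    return matmul(matmul(C, P), transpose(C))[0][0] + R


def steady_state_prior(iterations=500):
    """Iterate the Riccati recursion to an (approximately) steady a priori covariance."""
    P = [row[:] for row in Q]
    n = len(P)
    for _ in range(iterations):
        PCt = matmul(P, transpose(C))            # n x 1 column P C^T
        s = output_variance(P)                   # innovation variance
        outer = matmul(PCt, transpose(PCt))      # P C^T C P, using symmetry of P
        P_post = [[P[i][j] - outer[i][j] / s for j in range(n)] for i in range(n)]
        P = time_update(P_post)
    return P


def prediction_variances(d):
    """Return [V_{k+1|k}, ..., V_{k+d|k}] under the assumptions stated above."""
    P = steady_state_prior()
    variances = []
    for _ in range(d):
        variances.append(output_variance(P))
        P = time_update(P)
    return variances
```

Under these assumptions the returned sequence should be non-decreasing in $j$, which is the property the chain of inequalities in Lemma 2 relies on.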